A Document Graph Based Query Focused Multi-Document Summarizer

نویسندگان

  • Sibabrata Paladhi
  • Sivaji Bandyopadhyay
چکیده

This paper explores the research issue and methodology of a query focused multidocument summarizer. Considering its possible application area is Web, the computation is clearly divided into offline and online tasks. At initial preprocessing stage an offline document graph is constructed, where the nodes are basically paragraphs of the documents and edge scores are defined as the correlation measure between the nodes. At query time, given a set of keywords, each node is assigned a query dependent score, the initial graph is expanded and keyword search is performed over the graph to find a spanning tree identifying relevant nodes satisfying the keywords. Paragraph ordering of the output summary is taken care of so that the output looks coherent. Although all the examples, shown in this paper are based on English language, we show that our system is useful in generating query dependent summarization for nonEnglish languages also. We also present the evaluation of the system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-Document Summarization using Automatic Key-Phrase Extraction

The development of a multi-document summarizer using automatic key-phrase extraction has been described. This summarizer has two main parts; first part is automatic extraction of Key-phrases from the documents and second part is automatic generation of a multidocument summary based on the extracted key-phrases. The CRF based Automatic Keyphrase extraction system has been used here. A document g...

متن کامل

Answering Questions from Multiple Documents - the Role of Multi-Document Summarization

Ongoing research work on Question Answering using multi-document summarization has been described. It has two main sub modules, document retrieval and Multi-document Summarization. We first preprocess the documents and then index them using Nutch with NE field. Stop words are removed and NEs are tagged from each question and all remaining question words are stemmed and then retrieve the most re...

متن کامل

A Query Focused Multi Document Automatic Summarization

The present paper describes the development of a query focused multi-document automatic summarization. A graph is constructed, where the nodes are sentences of the documents and edge scores reflect the correlation measure between the nodes. The system clusters similar texts having related topical features from the graph using edge scores. Next, query dependent weights for each sentence are adde...

متن کامل

A Query-Focused Multi-Document Summarizer

This paper presents our work on queryfocused multi-document summarization with the enhanced IS_SUM system. We focus on improving its lexical chain algorithm for efficiency enhancement, applying the WordNet for similarity calculation and adapting it to query-focused multi-document summarization. We present its performance in terms of its official DUC2007 evaluation results together with some oth...

متن کامل

Context-based Multi-Document Summarization using Fuzzy Coreference Cluster Graphs

Constructing focused, context-based multi-document summaries requires an analysis of the context questions, as well as their corresponding document sets. We present a fuzzy cluster graph algorithm that finds entities and their connections between context and documents based on fuzzy coreference chains and describe the design and implementation of the ERSS summarizer implementing these ideas.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008